# DPO Optimization
Zhi Writing Dsr1 14b
Apache-2.0
A creative-writing enhancement model fine-tuned from DeepSeek-R1-Distill-Qwen-14B, showing significant improvements on creative-writing tasks.
Large Language Model
Transformers Supports Multiple Languages

Zhihu-ai
133
16
Slam
MIT
A speech language model built on discrete HuBERT tokens, focusing on efficient training and capable of generating continuations of speech segments.
Audio Generation
Transformers

slprl
115
10
SummLlama3.1 8B
SummLlama3.1-8B is a text summarization model initialized from Llama3.1-8B-Instruct and optimized with Direct Preference Optimization (DPO) on large-scale summarization feedback, excelling in faithfulness, completeness, and conciseness.
Text Generation
Transformers

DISLab
116
10
UNA ThePitbull 21.4B V2
UNA-ThePitbull-21.4B-v2 is a 21.4B-parameter large language model with performance approaching 70B-class models, combining emotional intelligence (EQ) and IQ, and excelling in dialogue and text generation.
Large Language Model
Transformers

fblgit
16
16
Llama3 OpenBioLLM 70B
OpenBioLLM-70B is an advanced open-source language model designed for the biomedical domain, fine-tuned from Meta-Llama-3-70B-Instruct, with outstanding performance on biomedical tasks.
Large Language Model
Transformers Supports Multiple Languages

aaditya
18.35k
428
SambaLingo Hungarian Chat
A human-preference-aligned chat model supporting Hungarian and English, adapted from Llama-2-7b for Hungarian.
Large Language Model
Transformers Supports Multiple Languages

sambanovasystems
154
43
LLaVA V1.5 13B DPO GGUF
LLaVA-v1.5-13B-DPO is a vision-language model based on the LLaVA framework, trained with Direct Preference Optimization (DPO) and converted to GGUF quantized format to improve inference efficiency.
Image-to-Text
antiven0m
30
0
Bloom 1b1 Zh Error Correction Dpo
A Chinese text error correction model trained with DPO, capable of automatically detecting and correcting spelling and grammar errors in Chinese text.
Large Language Model
Transformers Chinese

p208p2002
15
1
UNA TheBeagle 7b V1
TheBeagle is a 7-billion-parameter model trained on The Bagel dataset and optimized with DPO (Direct Preference Optimization) and UNA (Uniform Neural Alignment), demonstrating excellent performance across multi-task scenarios.
Large Language Model
Transformers

fblgit
88
37
Rocket 3B
Rocket-3B is a 3-billion-parameter large language model trained on public datasets with Direct Preference Optimization (DPO), outperforming many larger models.
Large Language Model
Transformers English

pansophic
26
85
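Several of the models listed above were aligned with Direct Preference Optimization (DPO). As a rough illustration of the technique (a minimal sketch, not taken from any of these models' training code), the per-pair DPO loss can be computed from the summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single preference pair.

    Arguments are the summed token log-probabilities of the chosen and
    rejected responses under the policy being trained and under the
    frozen reference model; beta controls how far the policy may drift
    from the reference.
    """
    pi_logratio = policy_chosen_logp - policy_rejected_logp
    ref_logratio = ref_chosen_logp - ref_rejected_logp
    logits = beta * (pi_logratio - ref_logratio)
    # loss = -log(sigmoid(logits)), evaluated in a numerically stable way
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# When the policy matches the reference, the loss is log(2) ≈ 0.693;
# it falls below that as the policy ranks the chosen response higher.
```

In practice, libraries such as Hugging Face TRL compute this loss over batches of tokenized pairs; the scalar version above only shows the shape of the objective.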